An overview of textual semantic similarity measures based on web intelligence
Computing the semantic similarity between terms (or short text expressions) that have the same meaning but are not lexicographically similar is a key challenge in many computer-related fields. The problem is that traditional approaches to semantic similarity measurement are not suitable for all situations. For example, many of them fail to deal with terms not covered by synonym dictionaries or cannot cope with acronyms, abbreviations, buzzwords, brand names, proper nouns, and so on. In this paper, we present and evaluate a collection of emerging techniques developed to avoid this problem. These techniques use some kind of web intelligence to determine the degree of similarity between text expressions, and they implement a variety of paradigms, including the study of co-occurrence, text snippet comparison, frequent pattern finding, and search log analysis. The goal is to substitute the traditional techniques where necessary.
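To make the co-occurrence paradigm mentioned above concrete, here is a minimal Python sketch of two well-known hit-count measures: a web-based Jaccard coefficient and the Normalized Google Distance. The abstract does not say which measures the paper evaluates, so these are illustrative choices only. The function names, the hit counts, and the total page count are hypothetical; no search engine is queried.

```python
import math


def web_jaccard(hits_a, hits_b, hits_ab):
    """Jaccard coefficient over page counts.

    hits_a, hits_b: number of pages containing each term alone (assumed inputs).
    hits_ab: number of pages containing both terms.
    """
    if hits_ab == 0:
        return 0.0
    return hits_ab / (hits_a + hits_b - hits_ab)


def normalized_google_distance(hits_a, hits_b, hits_ab, total_pages):
    """Normalized Google Distance: 0 means the terms always co-occur.

    total_pages is an assumed estimate of the size of the search index.
    """
    if hits_ab == 0:
        return math.inf  # the terms never co-occur
    log_a, log_b = math.log(hits_a), math.log(hits_b)
    log_ab = math.log(hits_ab)
    return (max(log_a, log_b) - log_ab) / (math.log(total_pages) - min(log_a, log_b))


# Hypothetical counts, chosen only to exercise the formulas.
print(web_jaccard(100, 50, 25))
print(normalized_google_distance(1000, 1000, 1000, 10**6))
```

Both functions take raw counts as arguments, so the same code works with any source of co-occurrence statistics, such as a search API or a local corpus index.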
A Survey on Legal Question Answering Systems
Many legal professionals think that the explosion of information about local,
regional, national, and international legislation makes their practice more
costly, time-consuming, and even error-prone. The two main reasons for this are
that most legislation is unstructured and that the tremendous amount and
pace with which laws are released cause information overload in their daily
tasks. In the legal domain, the research community agrees that a
system able to automatically generate responses to legal questions could
have a substantial practical impact on daily activities. The
degree of usefulness is such that even a semi-automatic solution could
significantly help to reduce the workload to be faced. This is mainly because a
Question Answering system could be able to automatically process a massive
amount of legal resources to answer a question or doubt in seconds, which means
that it could save resources in the form of effort, money, and time to many
professionals in the legal sector. In this work, we quantitatively and
qualitatively survey the solutions that currently exist to meet this challenge.
Comment: 57 pages, 1 figure, 10 tables
Automatic Design of Semantic Similarity Ensembles Using Grammatical Evolution
Semantic similarity measures are widely used in natural language processing
to catalyze various computer-related tasks. However, no single semantic
similarity measure is the most appropriate for all tasks, and researchers often
use ensemble strategies to ensure performance. This research work proposes a
method for automatically designing semantic similarity ensembles. In fact, our
proposed method uses grammatical evolution, for the first time, to
automatically select and aggregate measures from a pool of candidates to create
an ensemble that maximizes correlation to human judgment. The method is
evaluated on several benchmark datasets and compared to state-of-the-art
ensembles, showing that it can significantly improve similarity assessment
accuracy and outperform existing methods in some cases. As a result, our
research demonstrates the potential of using grammatical evolution to
automatically compare text and proves the benefits of using ensembles for
semantic similarity tasks. The source code that illustrates our approach can be
downloaded from https://github.com/jorge-martinez-gil/sesige.
Comment: 29 pages
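As a rough illustration of the objective described in this abstract (select and aggregate measures so that the ensemble correlates with human judgment), the Python sketch below brute-forces a very small search space. It is not grammatical evolution and is not taken from the linked repository. It is an exhaustive search over subsets and three aggregation operators, and all names and scores are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y) if std_x and std_y else 0.0


# Assumed pool of aggregation operators; a grammar would define a richer space.
AGGREGATORS = {"mean": lambda v: sum(v) / len(v), "min": min, "max": max}


def best_ensemble(measure_scores, human_scores):
    """Return (correlation, subset, aggregator name) of the best ensemble.

    measure_scores maps a measure name to its scores, one per text pair,
    aligned with human_scores.
    """
    best = (-2.0, None, None)
    names = sorted(measure_scores)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            for agg_name, agg in AGGREGATORS.items():
                combined = [
                    agg([measure_scores[name][i] for name in subset])
                    for i in range(len(human_scores))
                ]
                r = pearson(combined, human_scores)
                if r > best[0]:
                    best = (r, subset, agg_name)
    return best


# Hypothetical scores for three text pairs.
scores = {"a": [0.1, 0.5, 0.9], "b": [0.9, 0.5, 0.1]}
print(best_ensemble(scores, [0.0, 0.5, 1.0]))
```

Exhaustive search is only feasible for a handful of measures. An evolutionary method such as the one the abstract describes explores the same kind of space without enumerating it.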
Framework to Automatically Determine the Quality of Open Data Catalogs
Data catalogs play a crucial role in modern data-driven organizations by
facilitating the discovery, understanding, and utilization of diverse data
assets. However, ensuring their quality and reliability is complex, especially
in open and large-scale data environments. This paper proposes a framework to
automatically determine the quality of open data catalogs, addressing the need
for efficient and reliable quality assessment mechanisms. Our framework can
analyze various core quality dimensions, such as accuracy, completeness,
consistency, scalability, and timeliness; offer several alternatives for
assessing compatibility and similarity across such catalogs; and
implement a set of non-core quality dimensions such as provenance,
readability, and licensing. The goal is to empower data-driven organizations to
make informed decisions based on trustworthy and well-curated data assets. The
source code that illustrates our approach can be downloaded from
https://www.github.com/jorge-martinez-gil/dataq/.
Comment: 25 pages
A Novel Approach for Learning How to Automatically Match Job Offers and Candidate Profiles
Automatic matching of job offers and job candidates is a major problem for a
number of organizations and job applicants that, if successfully
addressed, could have a positive impact in many countries around the world. In
this context, it is widely accepted that semi-automatic matching algorithms
between job and candidate profiles would provide a vital technology for making
the recruitment processes faster, more accurate and transparent. In this work,
we present our research towards achieving a realistic matching approach for
satisfactorily addressing this challenge. This novel approach relies on a
matching learning solution aiming to learn from past solved cases in order to
accurately predict the results in new situations. An empirical study shows us
that our approach is able to beat solutions with no learning capabilities by a
wide margin.
Comment: 15 pages, 6 figures
Fuzzy Logics for Multiple Choice Question Answering
We have recently witnessed how solutions based on neural-inspired architectures have become the most popular for Multiple-Choice Question Answering. However, solutions of this kind are difficult to interpret, require many resources for training, and present obstacles to transfer learning. In this work, we move away from this mainstream to explore new methods based on fuzzy logic that can cope with these problems. The results that can be obtained are in line with those of cutting-edge neural solutions, but with advantages such as ease of interpretation, the low cost of the resources needed for training, and the possibility of transferring the acquired knowledge in a more straightforward and intuitive way.
AI-Based Recruiting: The Future Ahead
The Human Resources industry is currently being revolutionized by the automation of tedious and time-consuming aspects of its processes. Since AI paradigms such as deep neural networks and other machine learning methods can make accurate predictions and analyze vast amounts of information, these technologies are suitable for facing some of the major challenges in this domain. We give an overview here of how this industry is changing: from the automatic screening of candidates to bias removal in most processes, through techniques for the automatic discovery of potential employees and new advances for improving the candidate's experience.
NEFUSI: NeuroFuzzy Similarity. Final Report
This research work presents the final report for the NEFUSI project. We present here our research findings on building neurofuzzy models that automatically evaluate semantic textual similarity in an accurate and timely manner. We show that neural networks and fuzzy logic have different features that make them suitable for certain problems but unsuitable for others. Neural networks, on the one hand, are valuable tools for identifying patterns. However, they need to make it easier for people to comply with the decisions. On the other hand, interpretation is possible within fuzzy logic systems, but they cannot automatically derive the rules they use to make those decisions. These constraints served as the primary reason for developing a novel intelligent hybrid system, which combines the two approaches to circumvent the individual effects of both limitations simultaneously.